{"cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# Text Classification workflow with `arcgis.learn`"]}, {"cell_type": "markdown", "metadata": {}, "source": ["<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n", "\n", "<div class=\"toc\">\n", "<ul class=\"toc-item\">\n", "<li><span><a href=\"#Introduction\" data-toc-modified-id=\"Introduction-1\">Introduction</a></span></li>\n", "<li><span><a href=\"#Prerequisites\" data-toc-modified-id=\"Prerequisites-2\">Prerequisites</a></span></li>\n", "<li><span><a href=\"#Transformer-Basics\" data-toc-modified-id=\"Transformer-Basics-3\">Transformer Basics</a></span></li>\n", "    \n", "<li><span><a href=\"#Data-preparation\" data-toc-modified-id=\"Data-preparation-4\">Data preparation</a></span></li>\n", "<li><span><a href=\"#TextClassifier-model\" data-toc-modified-id=\"TextClassifier-model-5\">TextClassifier model</a></span></li>\n", "<ul class=\"toc-item\">\n", "<li><span><a href=\"#How-to-choose-an-appropriate-model-for-your-dataset?\" data-toc-modified-id=\"How-to-choose-an-appropriate-model-for-your-dataset?-5.1\">How to choose an appropriate model for your dataset?</a></span>    \n", "<li><span><a href=\"#Model-training\" data-toc-modified-id=\"Model-training-5.2\">Model training</a></span>\n", "    <ul class=\"toc-item\">\n", "        <li><span><a href=\"#Finding-optimum-learning-rate\" data-toc-modified-id=\"Finding-optimum-learning-rate-5.2.1\">Finding optimum learning rate</a></span>\n", "        <li><span><a href=\"#Evaluate-model-performance\" data-toc-modified-id=\"Evaluate-model-performance-5.2.2\">Evaluate model performance</a></span>\n", "        <li><span><a href=\"#Validate-results\" data-toc-modified-id=\"Validate-results-5.2.3\">Validate results</a></span></li> \n", "    </ul>\n", "</ul>\n", "<li><span><a href=\"#Model-inference\" data-toc-modified-id=\"Model-inference-6\">Model inference</a></span></li>\n", "<li><span><a href=\"#References\" 
data-toc-modified-id=\"References-7\">References</a></span></li>\n", "</ul>\n", "</div>"]}, {"cell_type": "markdown", "metadata": {}, "source": ["# Introduction\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Text classification, also known as text tagging or text categorization, is the process of assigning tags/labels to unstructured text. Using Natural Language Processing (NLP), text classifiers can automatically analyze text and then assign a set of pre-defined tags or categories based on its content. \n", "\n", "As with any other classification problem, text classification can be broadly divided into two categories:\n", "\n", "- **Multi-class single-label text classification**\n", "- **Multi-class multi-label text classification**\n", "\n", "### Multi-class single-label text classification\n", "Problems in which only a single label can be associated with a given input text fall into this category. Take the example of a house address. The address can be associated with a single country. Hence, classifying/tagging a house address with its country is an example of a multi-class single-label text classification problem. Other examples include:\n", "- **Sentiment Analysis** on tweets/movie reviews.\n", "- Classifying emails as **Spam vs not Spam**\n", "- **Language detection** from text\n", "\n", "### Multi-class multi-label text classification\n", "Problems in which multiple labels can be associated with a given input text fall into this category. Take an example where we are moderating a social media platform by flagging inappropriate user comments and posts. An inappropriate post can fall into multiple categories like toxic, threat, insult, obscene, etc. 
Other examples include:\n", "- **Analyzing customer support tickets** to quickly assign appropriate categories.\n", "- **Categorization of News Articles** into appropriate topics.\n", "\n", "The `TextClassifier` class in the `arcgis.learn.text` module is based on the [Hugging Face Transformers](https://huggingface.co/transformers/v3.0.2/index.html) library. This library provides transformer models like BERT, RoBERTa, XLM, DistilBERT, XLNet, etc., for **Natural Language Understanding (NLU)** with 32+ pretrained models in 100+ languages.\n", "\n", "Transformers are among the most recent and advanced models, and they give state-of-the-art results for a wide range of tasks such as **text / sequence classification**, **named entity recognition (NER)**, **question answering**, **machine translation**, **text summarization**, **text generation**, etc."]}, {"cell_type": "markdown", "metadata": {}, "source": ["# Prerequisites"]}, {"cell_type": "markdown", "metadata": {}, "source": ["- Data preparation and model training workflows for text classification using `arcgis.learn.text` are based on the [Hugging Face Transformers](https://huggingface.co/transformers/v3.0.2/index.html) library. A user can choose an appropriate architecture to train the model.\n", "- Refer to the section [Install deep learning dependencies of arcgis.learn module](https://developers.arcgis.com/python/guide/install-and-set-up/#Install-deep-learning-dependencies) for a detailed explanation of the deep learning dependencies.\n", "- **Labeled data**: For `TextClassifier` to learn, it needs to see examples that have been labeled for all the custom categories that the model is expected to classify an input text into. 
Head to the **Data preparation** section to see the supported formats for training data."]}, {"cell_type": "markdown", "metadata": {}, "source": ["# Transformer Basics"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Transformers in NLP are novel architectures that aim to solve [sequence-to-sequence](https://towardsdatascience.com/understanding-encoder-decoder-sequence-to-sequence-model-679e04af4346) tasks while handling [long-range dependencies](https://medium.com/tech-break/recurrent-neural-network-and-long-term-dependencies-e21773defd92) with ease. The Transformer was proposed in the paper [Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf). A transformer consists of an encoding component, a decoding component, and connections between them.\n", "\n", "<img src=\"\">\n", "\n", "<center>Figure 1: A high-level view depicting components of a Transformer [1]</center>\n", "\n", "- The **Encoding component** is a stack of encoders (the paper stacks six of them on top of each other). \n", "- The **Decoding component** is a stack of the same number of decoders. \n", "\n", "The encoders are all identical in structure (yet they do not share weights). Each one is broken down into two sub-layers:\n", "- **Self-Attention Layer**\n", "  - Say the following sentence is an input sentence we want to translate:\n", "  \n", "    **`The animal didn't cross the street because it was too tired`**\n", "    \n", "    What does **\"it\"** in this sentence refer to? Is it referring to the **street** or to the **animal**? It's a simple question to a human, but not as simple to an algorithm. When the model is processing the word **\"it\"**, self-attention allows the model to associate **\"it\"** with **\"animal\"**.\n", "\n", "- **Feed Forward Layer** - The outputs of the self-attention layer are fed to a feed-forward neural network. 
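\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The self-attention step described above can be sketched in a few lines of plain Python. This is only a minimal illustration of scaled dot-product attention under simplifying assumptions: the token vectors are small made-up numbers, and the learned query/key/value projections and multiple attention heads of a real transformer are omitted. The helper names `softmax`, `matmul` and `self_attention` are hypothetical and are not part of `arcgis.learn`."]},
{"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
"import math\n",
"\n",
"def softmax(row):\n",
"    # Subtract the row maximum before exponentiating for numerical stability.\n",
"    peak = max(row)\n",
"    exps = [math.exp(value - peak) for value in row]\n",
"    total = sum(exps)\n",
"    return [e / total for e in exps]\n",
"\n",
"def matmul(a, b):\n",
"    # Plain-Python matrix product of two matrices given as lists of rows.\n",
"    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]\n",
"\n",
"def self_attention(queries, keys, values):\n",
"    # Score every token against every other token, scaled by sqrt(d_k).\n",
"    d_k = len(keys[0])\n",
"    keys_t = [list(col) for col in zip(*keys)]\n",
"    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(queries, keys_t)]\n",
"    # Turn each row of scores into attention weights that sum to 1.\n",
"    weights = [softmax(row) for row in scores]\n",
"    # Each output vector is a weighted mix of all value vectors.\n",
"    return matmul(weights, values), weights\n",
"\n",
"# Toy vectors for three tokens (made-up numbers, for illustration only).\n",
"tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]\n",
"outputs, weights = self_attention(tokens, tokens, tokens)\n",
"print([round(sum(row), 6) for row in weights])"
]},
{"cell_type": "markdown", "metadata": {}, "source": ["In this sketch, each row of `weights` shows how strongly one token attends to every token in the sequence, which is the kind of weighting that lets a model link **\"it\"** to **\"animal\"** in the sentence above.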
\n", "\n", "The decoder has both those layers (**self-attention** & **feed forward layer**), but between them is an **attention layer** (sometimes called **encoder-decoder attention**) that helps the decoder focus on relevant parts of the input sentence.\n", "\n", "<img src=\"\">\n", "\n", "<center>Figure 2: Different layers in the Transformer's encoder & decoder components [1]</center>\n", "\n", "For a more detailed explanation of how the **attention**[[2]](#References) mechanism works in transformer models, visit [this page](https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/). \n", "\n", "An **\u201cannotated\u201d**[[3]](#References) version of the paper is also available in the form of a line-by-line implementation of the transformer architecture."]}, {"cell_type": "markdown", "metadata": {}, "source": ["# Data preparation"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The `TextClassifier` class in the `arcgis.learn.text` module can consume labeled training data in [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) or [TSV](https://en.wikipedia.org/wiki/Tab-separated_values) file format.\n", "\n", "There is a slight variation in the way the input data is created for \n", "- **Multi-class single-label text classification**\n", "- **Multi-class multi-label text classification**\n", "\n", "Sample input data format for a **Multi-class single-label text classification** problem\n", "\n", "<img width=\"700\" src=\"\">\n", "\n", "Sample input data format for a **Multi-class multi-label text classification** problem\n", "\n", "<img src=\"\">"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The main difference is that in a **Multi-class single-label text classification** problem, we have a single target column, but in a **Multi-class multi-label text classification** problem we have multiple target columns to train the model. 
The class values are **binary (0/1)**, where the value of **1** represents the presence of a particular class/label for the given training sample and **0** represents the absence of it. In the sample shown above, a text can be assigned to six different categories: **`toxic`**, **`severe_toxic`**, **`obscene`**, **`threat`**, **`insult`** and **`identity_hate`**. A column value of **1** (see row **#3**) means that the comment/text is labeled with the column name (**toxic** in this case)."]}, {"cell_type": "markdown", "metadata": {}, "source": ["**Data preparation** involves splitting the data into training and validation sets, creating the necessary data structures for loading data into the model, and so on. The `prepare_textdata` function can directly read the training samples in one of the formats specified above and automate the entire process. While calling this function, the user has to provide the following arguments:\n", "- **path**&nbsp;&ensp;&emsp;&emsp;&emsp;&emsp;- &emsp;&emsp;&emsp;&emsp;        The **full directory path** where the **training file** is present\n", "- **task**&nbsp;&ensp;&emsp;&nbsp;&emsp;&emsp;&emsp;- &emsp;&emsp;&emsp;&emsp;       The **task** for which the **dataset** is being prepared. The available choice at this point is **\"classification\"**\n", "- **train_file**&nbsp;&ensp;&emsp;&nbsp;&ensp;&nbsp;-  &emsp;&emsp;&emsp;&emsp; The file name containing the **training data**. Supported file formats/extensions are **.csv** and **.tsv**\n", "- **text_columns**&nbsp;&nbsp;-  &emsp;&emsp;&emsp;&emsp; The column name in the csv/tsv file that will be used as the **feature**.\n", "- **label_columns** -  &emsp;&emsp;&emsp;&emsp;The list of columns denoting the class label to predict. Provide a list of columns in case of a multi-label classification problem.\n", "\n", "Some pre-processing functions are also provided, like removing [HTML tags](https://html.com/tags/) or [URLs](https://en.wikipedia.org/wiki/URL) from the text. 
Users can decide whether these pre-processing steps are required for their dataset.\n", "\n", "**A note on the dataset**\n", "- The data was collected around 2020-05-27 by [OpenAddresses](http://openaddresses.io).\n", "- The data licenses can be found in `data/country-classifier/LICENSE.txt`."]}, {"cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": ["import pandas as pd\n", "from arcgis.learn import prepare_textdata\n", "from arcgis.learn.text import TextClassifier"]}, {"cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": ["DATA_ROOT = \"data/country_classifier/\""]}, {"cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": ["data = prepare_textdata(DATA_ROOT, \"classification\", train_file=\"house-addresses.csv\", \n", "                        text_columns=\"Address\", label_columns=\"Country\", batch_size=64)"]}, {"cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["['US', 'BE', 'AU', 'ZA', 'CA', 'BR', 'MX', 'FR', 'JP', 'ES']\n"]}], "source": ["print(data.classes)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The `show_batch()` method can be used to visualize the training samples, along with labels."]}, {"cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [{"data": {"text/html": ["<style  type=\"text/css\" >\n", "    #T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070 th {\n", "          text-align: left;\n", "    
}#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row0_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row0_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row1_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row1_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row2_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row2_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row3_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row3_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row4_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row4_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row5_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row5_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row6_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row6_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row7_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row7_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row8_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row8_col1,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row9_col0,#T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row9_col1{\n", "            text-align:  left;\n", "        }</style><table id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070\" ><thead>    <tr>        <th class=\"col_heading level0 col0\" >Address</th>        <th class=\"col_heading level0 col1\" >Country</th>    </tr></thead><tbody>\n", "                <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row0_col0\" class=\"data row0 col0\" >10, Place Cockerill, 0051, 4000</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row0_col1\" class=\"data row0 col1\" >BE</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row1_col0\" class=\"data row1 col0\" >547, RUA  DIRCEU LOPES, CASA, Pedro Leopoldo, MG, 33600-000</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row1_col1\" class=\"data row1 col1\" >BR</td>\n", "            </tr>\n", "            <tr>\n", "                                <td 
id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row2_col0\" class=\"data row2 col0\" >2, Rue de Ker Izella, Botsorhel, 29650</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row2_col1\" class=\"data row2 col1\" >FR</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row3_col0\" class=\"data row3 col0\" >168, RUA CORONEL MOREIRA CESAR, APARTAMENTO 402, Niter\u00f3i, RJ, 24230-062</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row3_col1\" class=\"data row3 col1\" >BR</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row4_col0\" class=\"data row4 col0\" >732P, CL ARENAL, 33740</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row4_col1\" class=\"data row4 col1\" >ES</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row5_col0\" class=\"data row5 col0\" >17-9, \u9ad8\u67f3\u65b0\u7530</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row5_col1\" class=\"data row5 col1\" >JP</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row6_col0\" class=\"data row6 col0\" >S/N, CALLE VENUSTIANO CARRANZA, NICOL\u00c1S BRAVO, Oth\u00f3n P. 
Blanco, Quintana Roo</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row6_col1\" class=\"data row6 col1\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row7_col0\" class=\"data row7 col0\" >SN, CALLE FRONTERA, MAZATL\u00c1N, Mazatl\u00e1n, Sinaloa</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row7_col1\" class=\"data row7 col1\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row8_col0\" class=\"data row8 col0\" >41, Oostmallebaan, Zoersel, 2980</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row8_col1\" class=\"data row8 col1\" >BE</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row9_col0\" class=\"data row9 col0\" >SN, RUA  ENGENHO PROPRIEDADE, Sirinha\u00e9m, PE, 55580-000</td>\n", "                        <td id=\"T_2c6b27c0_33b6_11eb_a969_a4bb6dafa070row9_col1\" class=\"data row9 col1\" >BR</td>\n", "            </tr>\n", "    </tbody></table>"], "text/plain": ["<pandas.io.formats.style.Styler at 0x154fd743708>"]}, "execution_count": 6, "metadata": {}, "output_type": "execute_result"}], "source": ["data.show_batch(rows=10)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["# TextClassifier model"]}, {"cell_type": "markdown", "metadata": {}, "source": ["`TextClassifier` model in `arcgis.learn.text` is built on top of [Hugging Face Transformers](https://huggingface.co/transformers/v3.0.2/index.html) library. The model training and inferencing workflow are similar to computer vision models in `arcgis.learn`. 
\n", "\n", "Run the command below to see which transformer backbones are supported for the classification task."]}, {"cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["['BERT', 'RoBERTa', 'DistilBERT', 'ALBERT', 'FlauBERT', 'CamemBERT', 'XLNet', 'XLM', 'XLM-RoBERTa', 'Bart', 'ELECTRA', 'Longformer', 'MobileBERT']\n"]}], "source": ["print(TextClassifier.supported_backbones)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## How to choose an appropriate model for your dataset?\n", "\n", "[This page](https://huggingface.co/transformers/v3.0.2/pretrained_models.html) lists different **transformer** architectures [[4]](#References) that come in different sizes (number of model parameters), are trained on different languages/corpora, have different numbers of attention heads, etc. Not every model can be used for `text classification`. As of now, 13 backbones can be used to perform `text classification`. These are `BERT`[[5]](#References), `RoBERTa`, `DistilBERT`, `ALBERT`, `FlauBERT`, `CamemBERT`, `XLNet`, `XLM`, `XLM-RoBERTa`, `Bart`, `ELECTRA`, `Longformer` and `MobileBERT`.\n", "\n", "\n", "Some considerations go into picking the right transformer architecture for the problem at hand. \n", "- Some models like `BERT`, `RoBERTa`, `XLNet`, and `XLM-RoBERTa` are highly accurate but are also larger in size. Generating inferences from these models is somewhat slow.\n", "- If one is willing to sacrifice a little accuracy for higher inferencing and training speed, one can go with `DistilBERT`.\n", "- If the model size is a constraint, then one can choose either `ALBERT` or `MobileBERT`. 
Remember that model performance will not be as good as that of models like `BERT`, `RoBERTa`, `XLNet`, etc.\n", "- If you have a dataset in the **French** language, you can choose `FlauBERT` or `CamemBERT`, as these language models are trained on **French** text.\n", "- When dealing with **long sentences/sequences** in the training data, one can choose from `XLNet`, `Longformer`, and `Bart`.\n", "- Some models like `XLM` and `XLM-RoBERTa` are [multi-lingual models](https://huggingface.co/transformers/v3.0.2/multilingual.html), i.e., models trained on multiple languages. If your dataset consists of text in multiple languages, you can choose one of the models mentioned in the above link. \n", "  - The model sizes of these transformer architectures are very large (in GBs). \n", "  - They require a large amount of memory to fine-tune on a particular dataset.\n", "  - Due to the large size of these models, inferencing with a fine-tuned model will be somewhat slow on a CPU."]}, {"cell_type": "markdown", "metadata": {}, "source": ["The Hugging Face Transformers library provides a wide variety of models for each of the backbones listed above. To see the full list, visit [this](https://huggingface.co/transformers/pretrained_models.html) link.\n", "\n", "- The call to the `available_backbone_models` method lists only a few of the available models for each backbone. \n", "- This list is not exhaustive and contains only a subset of the models listed in the link above. This function is intended to give the user a general idea of the available models for a given backbone.\n", "- That being said, the `TextClassifier` class supports any model from the 13 available backbones.\n", "- Some of the Transformer models are quite large due to the high number of training parameters or high number of intermediate layers. 
Thus, large models will have large CPU/GPU memory requirements."]}, {"cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["('xlm-roberta-base', 'xlm-roberta-large')\n"]}], "source": ["print(TextClassifier.available_backbone_models(\"xlm-roberta\"))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Construct the `TextClassifier` by passing the data and the backbone you have chosen.\n", "\n", "The dataset consists of addresses in multiple languages like Japanese, English, French, Spanish, etc., so we will use a [multi-lingual transformer backbone](https://huggingface.co/transformers/v3.0.2/multilingual.html) like `XLM-RoBERTa` to train our model."]}, {"cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [{"data": {"text/html": [], "text/plain": ["<IPython.core.display.HTML object>"]}, "metadata": {}, "output_type": "display_data"}, {"data": {"text/html": [], "text/plain": ["<IPython.core.display.HTML object>"]}, "metadata": {}, "output_type": "display_data"}], "source": ["model = TextClassifier(data, backbone=\"xlm-roberta-base\")"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Model training"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Finding optimum learning rate\n", "\n", "In machine learning, the `learning rate`[[6]](#References) is a tuning parameter that determines the step size at each iteration while moving toward a minimum of a loss function. It represents the speed at which a machine learning model \"learns\".\n", "\n", "- If the **learning rate is low**, then model training will take a lot of time because steps towards the minimum of the loss function are tiny.\n", "- If the **learning rate is high**, then training may fail to converge or may even diverge. 
Weight changes can be so big that the optimizer overshoots the minimum and makes the loss worse.\n", "\n", "We have to find an **optimum learning rate** for the dataset we wish to train our model on. To do so we will call the `lr_find()` method of the model.\n", "\n", "**Note**\n", "\n", "- A user is not required to call the `lr_find()` method separately. If `lr` argument is not provided while calling the `fit()` method then `lr_find()` method is internally called by the `fit()` method to find the optimal learning rate."]}, {"cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [{"data": {"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEGCAYAAACKB4k+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3deZwkZZ3n8c8vj8q6q/qopm+K7pFDkLNh8BhFXRWUBVRw3RVGXUZGd72vHWZmGUXdl86Mx6rjKIs6ODiiMuoiKogHiBfQ3XQ3NNBA0/RFH9VV3XVnVh6/+SOjsouiujqrKyOv+r5fr3xVROQTEb+nsip/8cQT8YS5OyIiIgCRSgcgIiLVQ0lBREQKlBRERKRASUFERAqUFEREpCBW6QBmauHChd7d3V3pMEREasq6desOuHvX0crVXFLo7u5m7dq1lQ5DRKSmmNn2Ysrp9JGIiBQoKYiISIGSgoiIFCgpiIhIgZKCiIgUKCmIiEiBkoKIiBQoKYiI1IAv/OJx7n2iJ/T9KCmIiFQ5d+dLv3qS+57qC31fSgoiIlVuZCxLNue0NYY/CIWSgohIlRtMZgBoa4yHvi8lBRGRKjeQTAOopSAiIjAYJIX2JrUURETmvIHC6SO1FERE5rzxPoV2JQURERkYHe9T0OkjEZE573BLQUlBRGTOG0ymiUWMxnj4X9lKCiIiVW4wmaGtMYaZhb4vJQURkSo3kEyXpT8BlBRERKreYDJDe1P4Vx6BkoKISNUbTKZpS6ilICIiHO5TKAclBRGRKpdPCnXSUjCzqJk9aGa3T/He28ysx8w2BK+/CDseEZFaMzCaLltLoRx7eR/wKNB+hPe/6+7vLkMcIiI1J5dzhsYyZRkMD0JuKZjZcuB1wI1h7kdEpF4NjWVwL8+4RxD+6aMvAB8FctOUeaOZbTKzW81sxVQFzOwaM1trZmt7esJ/RqmISLUYLOMIqRBiUjCzi4H97r5ummI/Brrd/XTgLuCmqQq5+w3uvsbd13R1dYUQrYhIdSrnYHgQbkvhxcAlZvY0cAvwCjO7eWIBd+9191QweyNwTojxiIjUnHIOhgchJgV3v9bdl7t7N/Bm4FfufuXEMma2ZMLsJeQ7pEVEJDBYxkdxQnmuPnoWM7seWOvutwHvNbNLgAzQB7yt3PGIiFSzcvcplGUv7n43cHcwfd2E5dcC15YjBhGRWjSQrJ8+BRERmaW6ufpIRERmbyCZpiEWoTEeLcv+lBRERKrYYDJTthvXQElBRKSq5cc9Kk9/AigpiIhUtXIOmw1KCiIiVW0wmS7bjWugpCAiUtXUUhARkYKBZPmepQBKCiIiVa2cT10DJQURkaqVyeYYG
cuqT0FERGAoVd67mUFJQUSkapV7iAtQUhARqVr9ZX7ADigpiIhUrcIDdprUUhARmfPGH7CjjmYREVGfgoiIHFbuB+yAkoKISNVSS0FERAoGk2ma4lHi0fJ9VSspiIhUqXIPhgdKCiIiVavcg+GBkoKISNUq92B4oKQgIlK1BpIZ2puUFEREhHxHs04fiYgIAAOjGdqVFEREBMZbCjp9JCIy541lcqQyObUURETk8GB4aimIiAgDFRjiApQURESqkloKIiJSUInB8EBJQUSkKlXiATtQhqRgZlEze9DMbp/ivYSZfdfMnjSz+8ysO+x4RERqQT33KbwPePQI710NHHT3PwE+D3ymDPGIiFS9gdE6bCmY2XLgdcCNRyhyKXBTMH0r8EozszBjEhGpBeN9Cq111lL4AvBRIHeE95cBOwHcPQP0AwsmFzKza8xsrZmt7enpCStWEZGqMZjM0JqIEY2U9zg5tKRgZhcD+9193Wy35e43uPsad1/T1dVVguhERKpbJQbDg3BbCi8GLjGzp4FbgFeY2c2TyuwGVgCYWQzoAHpDjElEpCZU4gE7EGJScPdr3X25u3cDbwZ+5e5XTip2G/DWYPryoIyHFZOISK2oxAN2oAL3KZjZ9WZ2STD7dWCBmT0JfBD4q3LHIyJSjQaT5R82G6Ase3T3u4G7g+nrJixPAleUIwYRkVoymExzwsKWsu9XdzSLiFShgWSmvvoURETk2Lh7RR6wA0oKIiJVJ5XJkc66WgoiIpK/HBWgvUktBRGROW9gND/ERSWuPlJSEBGpMocfsKOkICIy5x1+wI5OH4mIzHnjSaHcw2aDkoKISNXpHU4BMK9ZSUFEZM57qmeYloYoXW2Jsu9bSUFEpMps7RliVVcrlXjmmJKCiEiVeapnmNVd5R/3CJQURESqyshYht2HRlnV1VqR/SspiIhUkW0HhgFYraQgIiJbe4KksEinj0RE5ryneoYwg+4FSgoiInPe1p5hls9rojEercj+lRRERKrI1v1DFetPACUFEZGqkcs52w4Ms2qhkoKIyJy3ZyDJaDpbsU5mUFIQEakaT/UMAailICIi+f4EqNzlqKCkICJSNbb2DNPWGKOrtfwD4Y1TUhARqRJPHajcQHjjlBRERKrE1v2VGwhvXFFJwcxazCwSTJ9oZpeYWfmf/iAiUqeGUhn2DiQreo8CFN9S+A3QaGbLgF8Cbwf+JaygRETmmm3jYx7VQksBMHcfAd4AfMndXw+cGl5YIiJzy9bgctRaaSmYmb0QeAvwk2BZZQbmEBGpQ0/1DBExWLmguaJxFJsU3g9cC/zQ3Teb2Srg1+GFJSIyt2ztGWbl/GYSscoeb8eKKeTu9wD3AAQdzgfc/b1hBiYiMpds7ansQHjjir366N/MrN3MWoBHgC1m9pGjrNNoZveb2UYz22xmH5+izNvMrMfMNgSvvzi2aoiI1K7s+EB4Fe5khuJPHz3f3QeAy4CfAiuBq46yTgp4hbufAZwJXGhm509R7rvufmbwurHYwEVE6sUzh0ZJZXK101IA4sF9CZcB/9/d04BPt4LnDY2vH7ymXUdEZC4av/JoVQ0lha8BTwMtwG/M7Hhg4GgrmVnUzDYA+4G73P2+KYq90cw2mdmtZraiyHhEROrG1iq5RwGKTAru/kV3X+burw1aANuBlxexXtbdzwSWA+eZ2WmTivwY6Hb304G7gJum2o6ZXWNma81sbU9PTzEhi4jUjK09Q3Q2x5nf0lDpUIruaO4ws8+NfzGb2WfJtxqK4u6HyF/CeuGk5b3ungpmbwTOOcL6N7j7Gndf09XVVexuRURqwtb9Q6xa2FLRgfDGFXv66BvAIPCm4DUAfHO6Fcysy8w6g+km4FXAY5PKLJkwewnwaJHxiIjUhXQ2x0O7+zltWUelQwGKvE8BWO3ub5ww//Ggr2A6S4CbzCxKPvl8z91vN7PrgbXufhvwXjO7BMgAfcDbZha+iEht2/zMACNjWc47YX6lQ
wGKTwqjZvYSd/8tgJm9GBidbgV33wScNcXy6yZMX0v+TmkRkTnpgW19AJzXXVtJ4Z3At8xsvH1zEHhrOCGJiMwd923ro3tBM4vaGysdClD81Ucbg5vQTgdOd/ezgFeEGpmISJ3L5Zy12/uq5tQRzPDJa+4+ENzZDPDBEOIREZkzntg/xKGRNOdWyakjmN3jOCt/7ZSISA27/+l8f8KfnrCgwpEcNpukoCErRERm4f5tfSxub2TF/KZKh1IwbUezmQ0y9Ze/AdVTCxGRGuPuPLCtj3NPmF8VN62NmzYpuHtbuQIREZlLdvaNsncgWVWdzDC700ciInKMxvsTquX+hHFKCiIiFXD/tl46m+M8b1Hlh8ueSElBRKQC7t/Wx5rj5xOJVE9/AigpiIiU3f6BJE/3jvCnVdafAEoKIiJlN96fcK6SgoiIPLCtj+aGKKcuba90KM+hpCAiUmb3bevj7JXziEer7yu4+iISEalj/SNptuwbrLr7E8YpKYiIlNH6HQdxhzXd8yodypSUFEREymjd9oNEI8aZKzorHcqUlBRERMpo/Y6DnLKkjeaGYp9xVl5KCiIiZZLJ5tiw8xBnr6zOU0egpCAiUjZb9g0yMpblnOOVFERE5rz12w8CqKUgIiKwfschutoSLJ9XvY+jUVIQESmTddsPcvbKzqp6qM5kSgoiImXQM5hiR99IVfcngJKCiEhZrN9R/f0JoKQgIlIW63ccJB41TlvWUelQpqWkICJSBuu3H+TUpR00xqOVDmVaSgoiIiEby+TYtKu/6vsTQElBRCR0j+wZIJXJVX1/AigpiIiErnDT2vHVOQjeREoKIiIhW7fjIEs7GlnSUb03rY1TUhARCdmD2w9ydg30J0CIScHMGs3sfjPbaGabzezjU5RJmNl3zexJM7vPzLrDikdEpBL29I/yTH+yJvoTINyWQgp4hbufAZwJXGhm508qczVw0N3/BPg88JkQ4xERKbv12w8B1MSVRxBiUvC8oWA2Hrx8UrFLgZuC6VuBV1o1DwoiIjJDa7f3kYhFOGVJe6VDKUqofQpmFjWzDcB+4C53v29SkWXATgB3zwD9wIIptnONma01s7U9PT1hhiwiUjLuzi8f3c/5qxbQEKuNLtxQo3T3rLufCSwHzjOz045xOze4+xp3X9PV1VXaIEVEQvLIngF29I1w0WmLKx1K0cqSutz9EPBr4MJJb+0GVgCYWQzoAHrLEZOISNjufHgvEYNXPf+4SodStDCvPuoys85gugl4FfDYpGK3AW8Npi8HfuXuk/sdRERq0h2b93LeCfNZ0JqodChFC7OlsAT4tZltAh4g36dwu5ldb2aXBGW+DiwwsyeBDwJ/FWI8IiJls7VniMf3DXHhqbVz6gggFtaG3X0TcNYUy6+bMJ0ErggrBhGRSrnj4b0AvKaG+hNAdzSLiITizs17OXNFZ00MbTGRkoKISIntOjjCpl39XFhjrQRQUhARKbk7N+8DqLn+BFBSEBEpuTsf3svJi9voXthS6VBmTElBRKSE9g8meWB7X02eOgIlBRGRkrrrkX24w0WnLal0KMdESUFEpITueHgvJyxs4cTjWisdyjFRUhARKZH9g0n+sLWX15y6mFod8FlJQUSkRG65fyeZnPOmNcsrHcoxU1IQESmBdDbHt+/bzktP7GJVV22eOgIlBRGRkvj55n3sG0jx1hceX+lQZkVJQUSkBG76w9OsnN/MBSctqnQos6KkICIyS4/uGeD+bX1cdf7xRCO12cE8TklBRGSWvvWH7TTGI1xRwx3M45QURERmoX8kzY8e3M1lZy6js7mh0uHMmpKCiMgsfH/dTkbTWa6q8Q7mcUoKIiLHKJdzvvWH7ZzbPY9Tl3ZUOpySUFIQETlG9zzRw46+Ef78hd2VDqVklBRERI7RnQ/vpS0R4zU1+NyEI1FSEBE5Bu7O3Vt6eMnzFtIQq5+v0vqpiYhIGT2+b4i9A0kuOKmr0qGUlJKCiMgxuOfx/
QC89EQlBRGROe/uLT2cdFwbSzqaKh1KSSkpiIjM0HAqwwNP99XdqSNQUhARmbE/bO0lnXVeVmenjkBJQURkxu5+fD/NDVHO6Z5X6VBKTklBRGQGxi9FfdHqhSRi0UqHU3JKCiIiM7DtwDC7Do7ysjrsTwAlBRGRGbl7Sw8AF9RhfwIoKYiIzMg9j/ewqquFFfObKx1KKJQURESKlExn+eNTvXV51dE4JQURkSL98aleUpmcksKxMLMVZvZrM3vEzDab2fumKHOBmfWb2YbgdV1Y8YiIzNY9j/eQiEU4f9WCSocSmliI284AH3L39WbWBqwzs7vc/ZFJ5e5194tDjENEZNZ6h1LcvmkP569aQGO8/i5FHRdaS8Hd97j7+mB6EHgUWBbW/kREwpLJ5njPdx5kYDTNR15zUqXDCVVZ+hTMrBs4C7hvirdfaGYbzexnZnbqEda/xszWmtnanp6eECMVEXmuf/j5Fn6/tZdPvf4FnLasPh67eSShJwUzawX+HXi/uw9Mens9cLy7nwF8CfjRVNtw9xvcfY27r+nqqt8OHhGpPj/ZtIev3fMUV56/ksvPWV7pcEIXalIwszj5hPBtd//B5PfdfcDdh4LpnwJxM1sYZkwiIsV6Yt8gH7l1I2et7OS6i6c8kVF3wrz6yICvA4+6++eOUGZxUA4zOy+IpzesmEREirX70Ch/+a/raG6I8c9vOaeuHrk5nTCvPnoxcBXwkJltCJb9NbASwN2/ClwOvMvMMsAo8GZ39xBjEhGZVjKd5YbfPMVX7n4SgJvefh6LOxorHFX5hJYU3P23gB2lzJeBL4cVg4hIsdydOzfv5ZM/eZRdB0d53QuWcO1rT2b5vPoczuJIwmwpiIjUhHXb+/j7O7Zw37Y+Tl7cxnfecT4vXF2/N6hNR0lBROaETDZHLPrsfoHNz/Tz2Z8/zq8e28/C1gSfuPRU/ut5K59Tbi5RUhCRuubufPInj/KN322jtSHGwrYEC1sbiEcj/H5rL+2NMT564Um87UXdNDfoK1G/ARGpa1/4xRN8/bfbuPj0JSxsTXBgKEXPYIq+4TH+58tXc81LV9PRFK90mFVjziSF0bEsvcMpxq9tcoecO+OXOnkw3RCNcFx745y5/ExkWlu3wmc/CzffDEND0NoKV14JH/oQrF5d6eiO6pu/28b//eUTvGnNcj7zxtMJroCXacyZpPDLx/bx7n97sKiyZrCoLcHSziaWdjRhBulsjrFMjnTWSWdzZHNO1p1czolEjJMXt3H68k7OWN7Jice1zulzklInfvYzuPxySKfzL4DBQbjxRrjpJrj1VrjoosrGOI0fPriLj//4EV79/OP4P69/gRJCkazWbgtYs2aNr127dsbr7ewb4Q9be8Hy18lGzDDLJwALrpw1y1+j/MyhJM8cGuWZ/lH29CeBfAsiHo3QEIsQjRixiBGNGBEzxjI5HtkzQP9o/h+nMR5h5fxmFrQkWNDawMLWBB1NcaKR/J4ikfy+I2ZEzYhEjIhBNucMpTIMpzIMpTKMjGVpTcTobI4zr7mBjqY4DbEI6ayTyeZI5/JJKRox4lEjGokU4ir8jBoN0SitjTHaglciGmXLvkE27TrExl39PLTrED2DqSDJQdadiMGClgSL2hMsamtkUXuCxkkPKY9GoCkepakhRnNDlKaGKC0NMZoTwc+GKKlMjj39o/nf56EkPUMp4hEjEYvSGI+QiEeJRp77zxoxiJphlq9Hc0OU1kSMlkSM1sYYiVgEHHIO4+298W02xaM0xqOYUahPNueMZXL0j6aD1xj9o2nSWZ9UJ6M1cfh31dYYD+oYzdcxHiUSMbJZJ5PzwsFBLPidxyIRYlELPpPIlHWrCVu3wumnw8jIkcs0N8OmTdO2GNLZHH3DYxwYStE7lP+dR8wm/M0aLYkYHU1x2hvjtDfFiJjROzzGgcEUB4ZSHBpJc/yCZp6/tL3oc/6/eGQff3nzOs7rns83335uXY9qWiwzW+fua
45Wbs60FFbMbw718XnuzvbeETbuOsSmXf3sOjhC79AYm58Z4MBQisFkpuhtNTdEaUnEaIpHGU5lODSaJpsLJ3kv7Wjk9OWd/KdTmvJJLpJPVFl3Dgym2DeYYtfBEdbvOEg6k3vWupmcM5rOFr2v1kSMRW0Jsu4k01lSmRzJdJbcszeL4+TGT+/V1jHLc5hBPBIhHjWaE7FCYmlqiJJzSGdypLP5V2M8ytLOJpZ1NrFsXhNLOhpZ2JpgXnMD81ryBwbZnHNoNE3/SJpDo2Oks87i9kYWdzTS3hjDzHB3+kfTbO8dYXvfCGOZHCvnN3P8gma6WhNEIkYu5/QMpdjeO8KOvhFG01nikcPJ7OzPfILlY+lphzxIJ1P8/p3X8tC1n+Kkxe10L2hme+8IDz/Tz+ZnBti8u59ngoOqUogYrO5q5QXLOjh/1QJee/oSWhPP/grbdXCEf7xzCz/a8AwvWNbBDX9+jhLCDM2ZlkKl5XL549nxL7qcO7ngCDbnFE5DtTREn3PqyT3fgjg0kiadzRWORmPRw1/g2ZyTKRy95sgE89lc/gt4KJVhMJlhMJVhdCzD6q5WTl/eSVdbYlb1cneS6RzDYxlGx7KMjGUZHsswksr/bIhGWNrZxJLORtobZ96Z50HdRtJZhpL5VtRgKsNYJocBFrT4AFLpfJIZTWdJprO451tl0Ui+VdYQjdDRFKejOU5n0PKKR4NWYtBaTGdzh39XyXShxZZMZwv1ywUtg2g03zKLGIVWw3grLlP4PPKnHMcyOUbT+W0Np/Lbi0SMhqjREMu3QodTWXYfyreqxludM9HcEGVRW4Le4bEjHoQkYhEWtSfYP5AiNSnJT/TQ56+gbWz0qPscSjRz2vu/96xlZrBqYQunLeuge0ELC9sSdE1oMTsU/jbTuRzDqQwDoxn6R9MMJPMHQAta8uUXtiVoa4yxrWeYh3b38/Dufjbt7qdnMEVTPMpFL1jMm9as4JTF7Xzlnif55u+exoCrX3IC77pgNW3H8DdXr4ptKSgpiFShwWSavf1J+obHODgyRt9wmoMjY0QjRmdTnM7mOB1NDcSixr6BJHv7k+zpT7JvIMn8lgZWzm8OWgctxKPGzoOj7OgdZkffCPsGUhzXnmDlgpZCuZZEtHBgkc7mOOG4dqyY74ZIhKGRFFv2DvL0gWGOX9DMKUvaaUmEdxLC3Xlw5yG+v3YnP964h6FUhmjEyOacN5y9jA+/+iSWdjaFtv9apaQgIseuvT3fqVxMuf7+8OM5gtGxLD97eA8bdh7iTWtW1P2zDmaj2KSgS2RE5LmuvBLiRzn1Eo/DVVeVJ54jaGqI8oazl3P9pacpIZSIkoKIPNeHPlRcUvjAB8oTj5SNkoKIPNfq1fn7EJqbn5sc4vH88ltvrYkb2GRmlBREZGoXXZS/D+Gaa/J9B5FI/uc11+SXV/GNa3Ls1NEsIjIHqKNZRERmTElBREQKlBRERKRASUFERApqrqPZzHqA7UUU7QCKudVyunJTvTd52XTzR5peCBwoIrYjUd1mX7fJy+ZK3SbOl7JuR4pjJmWOpW6T56earta6TbU8zLod7+5dRy3l7nX5Am6Ybbmp3pu8bLr5aabXqm6VrdsM6lNXdZs4X8q6FVu/UtetmM+uWut2tLqUq26TX/V8+ujHJSg31XuTl003f6Tp2VLdZl+3ycvmSt0mzpeybsVur9R1mzxfyc9upnWbankl6vYsNXf6qB6Y2Vov4nrhWqS61SbVrTaFUbd6bilUsxsqHUCIVLfapLrVppLXTS0FEREpUEtBREQKlBRERKRASWEWzOwbZrbfzB4+hnXPMbOHzOxJM/ui2fiThsHM3mNmj5nZZjP7+9JGXXR8Ja+bmX3MzHab2Ybg9drSR150jKF8dsH7HzIzN7OFpYt4RvGF8dl9wsw2BZ/bz81saekjLyq+MOr2D8H/2yYz+6GZdZY+8qLiC6NuVwTfIzkzK65DutTXuM6lF/BS4Gzg4WNY9
37gfMCAnwEXBctfDvwCSATzi+qobh8DPlzpzy2s+gXvrQDuJH+D5cJ6qRvQPqHMe4Gv1lHdXg3EgunPAJ+po7qdApwE3A2sKWZbainMgrv/BuibuMzMVpvZHWa2zszuNbOTJ69nZkvI/5P90fOf3LeAy4K33wV82t1TwT72h1uLqYVUt6oRYv0+D3wUqNgVHGHUzd0HJhRtoUL1C6luP3f3TFD0j8DycGsxtZDq9qi7b5lJHEoKpXcD8B53Pwf4MPCVKcosA3ZNmN8VLAM4EfgzM7vPzO4xs3NDjXZmZls3gHcHzfRvmNm88EI9JrOqn5ldCux2941hB3oMZv3ZmdmnzGwn8BbguhBjnalS/F2O++/kj7SrRSnrVpTYsa4oz2VmrcCLgO9POM2cmOFmYsB88k3Bc4Hvmdmq4AigYkpUt38GPkH+KPMTwGfJ/xNW3GzrZ2bNwF+TPxVRVUr02eHufwP8jZldC7wb+LuSBXmMSlW3YFt/A2SAb5cmutkpZd1mQkmhtCLAIXc/c+JCM4sC64LZ28h/OU5soi4HdgfTu4AfBEngfjPLkR/0qifMwIsw67q5+74J6/0/4PYwA56h2dZvNXACsDH4B14OrDez89x9b8ixH00p/i4n+jbwU6ogKVCiupnZ24CLgVdW+gBsglJ/bsWpRIdKPb2AbiZ0DAG/B64Ipg044wjrTe4Yem2w/J3A9cH0icBOgpsM66BuSyaU+QBwSz19dpPKPE2FOppD+uyeN6HMe4Bb66huFwKPAF2V/HsM82+SGXQ0V/QXUOsv4DvAHiBN/gj/avJHi3cAG4M/tOuOsO4a4GFgK/Dl8S9+oAG4OXhvPfCKOqrbvwIPAZvIH+EsKVd9ylG/SWUqlhRC+uz+PVi+ifxgbMvqqG5Pkj/42hC8KnVlVRh1e32wrRSwD7jzaHFomAsRESnQ1UciIlKgpCAiIgVKCiIiUqCkICIiBUoKIiJSoKQgdcHMhsq8v9+XaDsXmFl/MProY2b2j0Wsc5mZPb8U+xeZTElBZApmNu3d/u7+ohLu7l7P37V6FnCxmb34KOUvA5QUJBQa5kLqlpmtBv4J6AJGgHe4+2Nm9p+BvyV/o2Av8BZ332dmHwOWkr+r9ICZPQ6sBFYFP7/g7l8Mtj3k7q1mdgH5IcEPAKeRH37gSnd3yz8v4nPBe+uBVe5+8ZHidfdRM9vA4QH23gFcE8T5JHAVcCZwCfAyM/tb4I3B6s+p5yx+dTKHqaUg9exII0z+Fjjf3c8CbiE/1PW4c4BL3f2/BfMnA68BzgP+zsziU+znLOD95I/eVwEvNrNG4Gvkx7V/Cfkv7GkFo8Y+D/hNsOgH7n6uu58BPApc7e6/J383+Efc/Ux33zpNPUVmTC0FqUtHGWFyOfDdYBz6BmDbhFVvc/fRCfM/8fyzLVJmth84jmcPUwxwv7vvCva7gXxLYwh4yt3Ht/0d8kf9U/kzM9tE/mEon/bDA+idZmafBDqBVvIP75lJPUVmTElB6tWUI0wGvgR8zt1vm3D6Z9zwpLKpCdNZpv6fKabMdO5194vN7ETgXjP7obtvAP4FuMzdNwajeF4wxbrT1VNkxnT6SOqS558Uts3MrgCwvDOCtzs4PLTwW0MKYQuwysy6g/n/crQV3P1x4NPA/woWtQF7glNWb5lQdDB472j1FJkxJQWpF81mtmvC64Pkv0ivNrONwGbg0qDsx8ifbrmXfCdwyQWnoP4HcIeZ/Zb8CEvnCNYAAAB2SURBVJX9Raz6VeClQTL538B9wF3AxI7jW4CPmNmDQWf6keopMmMaJVUkJGbW6u5Dlj/Z/0/AE+7++UrHJTIdtRREwvOOoON5M/lTVl+rcDwiR6WWgoiIFKilICIiBUoKIiJSoKQgIiIFSgoiIlKgpCAiIgX/AaRWtBULyfUSAAAAAElFTkSuQmCC\n", "text/plain": ["<Figure size 432x288 with 1 Axes>"]}, 
"metadata": {"needs_background": "light"}, "output_type": "display_data"}, {"data": {"text/plain": ["0.001445439770745928"]}, "execution_count": 10, "metadata": {}, "output_type": "execute_result"}], "source": ["model.lr_find()"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Training the model is an iterative process. We can train the model using its `fit()` method till the validation loss (or error rate) continues to go down with each training pass also known as epoch. This is indicative of the model learning the task."]}, {"cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [{"data": {"text/html": ["<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: left;\">\n", "      <th>epoch</th>\n", "      <th>train_loss</th>\n", "      <th>valid_loss</th>\n", "      <th>accuracy</th>\n", "      <th>error_rate</th>\n", "      <th>time</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <td>0</td>\n", "      <td>0.173400</td>\n", "      <td>0.111699</td>\n", "      <td>0.956300</td>\n", "      <td>0.043700</td>\n", "      <td>05:24</td>\n", "    </tr>\n", "    <tr>\n", "      <td>1</td>\n", "      <td>0.062744</td>\n", "      <td>0.044339</td>\n", "      <td>0.981100</td>\n", "      <td>0.018900</td>\n", "      <td>05:15</td>\n", "    </tr>\n", "    <tr>\n", "      <td>2</td>\n", "      <td>0.040257</td>\n", "      <td>0.029966</td>\n", "      <td>0.986300</td>\n", "      <td>0.013700</td>\n", "      <td>05:22</td>\n", "    </tr>\n", "    <tr>\n", "      <td>3</td>\n", "      <td>0.032077</td>\n", "      <td>0.024974</td>\n", "      <td>0.989300</td>\n", "      <td>0.010700</td>\n", "      <td>05:32</td>\n", "    </tr>\n", "    <tr>\n", "      <td>4</td>\n", "      <td>0.030770</td>\n", "      <td>0.024296</td>\n", "      <td>0.989800</td>\n", "      <td>0.010200</td>\n", "      <td>05:19</td>\n", "    </tr>\n", "    <tr>\n", "      <td>5</td>\n", "      <td>0.027273</td>\n", "      
<td>0.023898</td>\n", "      <td>0.990600</td>\n", "      <td>0.009400</td>\n", "      <td>05:21</td>\n", "    </tr>\n", "  </tbody>\n", "</table>"], "text/plain": ["<IPython.core.display.HTML object>"]}, "metadata": {}, "output_type": "display_data"}], "source": ["model.fit(epochs=6, lr=0.001)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Evaluate model performance"]}, {"cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.9906"]}, "execution_count": 14, "metadata": {}, "output_type": "execute_result"}], "source": ["model.accuracy()"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Other important metrics to look at are precision, recall and the F1 measure [[7]](#References).\n", "\n", "Here is a brief description of each:\n", "- **Precision** - Of all the samples the model predicted as positive, the fraction that are actually positive.\n", "- **Recall** - Of all the samples that are actually positive, the fraction the model correctly identified.\n", "- **F1** - The harmonic mean of precision and recall, which is high only when both precision and recall are high.\n", "\n", "To learn more about these metrics, see [Precision, Recall & F1 score](https://en.wikipedia.org/wiki/Precision_and_recall).\n", "\n", "To get the `precision`, `recall` and `f1` scores for each label/class, we call the model's `metrics_per_label()` method."]}, {"cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [{"data": {"text/html": ["\n", "    <div>\n", "        <style>\n", "            /* Turns off some styling */\n", "            progress {\n", "                /* gets rid of default border in Firefox and Opera. */\n", "                border: none;\n", "                /* Needs to be in here for Safari polyfill so background images work as expected. 
*/\n", "                background-size: auto;\n", "            }\n", "            .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n", "                background: #F44336;\n", "            }\n", "        </style>\n", "      <progress value='10000' class='' max='10000' style='width:300px; height:20px; vertical-align: middle;'></progress>\n", "      100.00% [10000/10000 05:15<00:00]\n", "    </div>\n", "    "], "text/plain": ["<IPython.core.display.HTML object>"]}, "metadata": {}, "output_type": "display_data"}, {"data": {"text/html": ["<div>\n", "<style scoped>\n", "    .dataframe tbody tr th:only-of-type {\n", "        vertical-align: middle;\n", "    }\n", "\n", "    .dataframe tbody tr th {\n", "        vertical-align: top;\n", "    }\n", "\n", "    .dataframe thead th {\n", "        text-align: right;\n", "    }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>Precision_score</th>\n", "      <th>Recall_score</th>\n", "      <th>F1_score</th>\n", "      <th>Support</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>AU</th>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>929.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>BE</th>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1043.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>BR</th>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>950.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>CA</th>\n", "      <td>0.9295</td>\n", "      <td>0.9799</td>\n", "      <td>0.9541</td>\n", "      <td>996.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>ES</th>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>982.0</td>\n", "    </tr>\n", "    <tr>\n", "      
<th>FR</th>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1009.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>JP</th>\n", "      <td>1.0000</td>\n", "      <td>0.9990</td>\n", "      <td>0.9995</td>\n", "      <td>989.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>MX</th>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1024.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>US</th>\n", "      <td>0.9803</td>\n", "      <td>0.9318</td>\n", "      <td>0.9554</td>\n", "      <td>1070.0</td>\n", "    </tr>\n", "    <tr>\n", "      <th>ZA</th>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1.0000</td>\n", "      <td>1008.0</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "</div>"], "text/plain": ["    Precision_score  Recall_score  F1_score  Support\n", "AU           1.0000        1.0000    1.0000    929.0\n", "BE           1.0000        1.0000    1.0000   1043.0\n", "BR           1.0000        1.0000    1.0000    950.0\n", "CA           0.9295        0.9799    0.9541    996.0\n", "ES           1.0000        1.0000    1.0000    982.0\n", "FR           1.0000        1.0000    1.0000   1009.0\n", "JP           1.0000        0.9990    0.9995    989.0\n", "MX           1.0000        1.0000    1.0000   1024.0\n", "US           0.9803        0.9318    0.9554   1070.0\n", "ZA           1.0000        1.0000    1.0000   1008.0"]}, "execution_count": 15, "metadata": {}, "output_type": "execute_result"}], "source": ["model.metrics_per_label()"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### Validate results\n", "\n", "Once the model is trained, we can review its predictions on sample records from the validation set to see how it performs."]}, {"cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [{"data": {"text/html": ["<style  type=\"text/css\" >\n", "    #T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070 th {\n", "          text-align: left;\n", "    
}#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row0_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row0_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row0_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row1_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row1_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row1_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row2_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row2_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row2_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row3_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row3_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row3_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row4_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row4_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row4_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row5_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row5_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row5_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row6_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row6_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row6_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row7_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row7_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row7_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row8_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row8_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row8_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row9_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row9_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row9_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row10_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row10_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row10_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row11_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row11_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row11_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row12_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row12_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row12_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row13_col0,#T_7f28c7ac_33bc_11eb_9747_a4
bb6dafa070row13_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row13_col2,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row14_col0,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row14_col1,#T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row14_col2{\n", "            text-align:  left;\n", "        }</style><table id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070\" ><thead>    <tr>        <th class=\"col_heading level0 col0\" >text</th>        <th class=\"col_heading level0 col1\" >target</th>        <th class=\"col_heading level0 col2\" >prediction</th>    </tr></thead><tbody>\n", "                <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row0_col0\" class=\"data row0 col0\" >SN, AVENIDA JOSE MARIA MORELOS Y PAVON OTE., APATZING\u00c1N DE LA CONSTITUCI\u00d3N, Apatzing\u00e1n, Michoac\u00e1n de Ocampo</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row0_col1\" class=\"data row0 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row0_col2\" class=\"data row0 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row1_col0\" class=\"data row1 col0\" >906, AVENIDA JOSEFA ORT\u00cdZ DE DOM\u00cdNGUEZ, CIUDAD MENDOZA, Camerino Z. 
Mendoza, Veracruz de Ignacio de la Llave</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row1_col1\" class=\"data row1 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row1_col2\" class=\"data row1 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row2_col0\" class=\"data row2 col0\" >32, CIRCUITO JOS\u00c9 MAR\u00cdA URIARTE, FRACCIONAMIENTO RANCHO ALEGRE, Tlajomulco de Z\u00fa\u00f1iga, Jalisco</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row2_col1\" class=\"data row2 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row2_col2\" class=\"data row2 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row3_col0\" class=\"data row3 col0\" >SN, ESTRADA SP 250 SENTIDO GRAMADAO, LADO DIREITO FAZENDA SAO RAFAEL CASA 4, S\u00e3o Miguel Arcanjo, SP, 18230-000</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row3_col1\" class=\"data row3 col1\" >BR</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row3_col2\" class=\"data row3 col2\" >BR</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row4_col0\" class=\"data row4 col0\" >SN, CALLE JOSEFA ORT\u00cdZ DE DOM\u00cdNGUEZ, RINC\u00d3N DE BUENA VISTA, Omealca, Veracruz de Ignacio de la Llave</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row4_col1\" class=\"data row4 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row4_col2\" class=\"data row4 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td 
id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row5_col0\" class=\"data row5 col0\" >SN, CALLE MICHOACAN, DOLORES HIDALGO CUNA DE LA INDEPENDENCIA NACIONAL, Dolores Hidalgo Cuna de la Independencia Nacional, Guanajuato</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row5_col1\" class=\"data row5 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row5_col2\" class=\"data row5 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row6_col0\" class=\"data row6 col0\" >SN, CALLE VERDUZCO, COALCOM\u00c1N DE V\u00c1ZQUEZ PALLARES, Coalcom\u00e1n de V\u00e1zquez Pallares, Michoac\u00e1n de Ocampo</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row6_col1\" class=\"data row6 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row6_col2\" class=\"data row6 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row7_col0\" class=\"data row7 col0\" >1712, CALLE M\u00c1RTIRES DEL 7 DE ENERO, CIUDAD MENDOZA, Camerino Z. 
Mendoza, Veracruz de Ignacio de la Llave</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row7_col1\" class=\"data row7 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row7_col2\" class=\"data row7 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row8_col0\" class=\"data row8 col0\" >SN, AVENIDA JACOBO G\u00c1LVEZ, FRACCIONAMIENTO RANCHO ALEGRE, Tlajomulco de Z\u00fa\u00f1iga, Jalisco</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row8_col1\" class=\"data row8 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row8_col2\" class=\"data row8 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row9_col0\" class=\"data row9 col0\" >SN, ANDADOR MZNA 6 AMP. LOS ROBLES, EL PUEBLITO (CRUCERO NACIONAL), C\u00f3rdoba, Veracruz de Ignacio de la Llave</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row9_col1\" class=\"data row9 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row9_col2\" class=\"data row9 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row10_col0\" class=\"data row10 col0\" >SN, CALLE S\u00c9PTIMA PONIENTE SUR (EJE VIAL), COMIT\u00c1N DE DOM\u00cdNGUEZ, Comit\u00e1n de Dom\u00ednguez, Chiapas</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row10_col1\" class=\"data row10 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row10_col2\" class=\"data row10 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td 
id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row11_col0\" class=\"data row11 col0\" >18, CALLE FELIPE GORRITI / FELIPE GORRITI KALEA, Pamplona / Iru\u00f1a, Pamplona / Iru\u00f1a, Navarra, 31004</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row11_col1\" class=\"data row11 col1\" >ES</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row11_col2\" class=\"data row11 col2\" >ES</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row12_col0\" class=\"data row12 col0\" >SN, RUA X VINTE E SEIS, QUADRA 14 LOTE 35 SALA 3, Aparecida de Goi\u00e2nia, GO, 74922-680</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row12_col1\" class=\"data row12 col1\" >BR</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row12_col2\" class=\"data row12 col2\" >BR</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row13_col0\" class=\"data row13 col0\" >SN, CALLE NINGUNO, HEROICA CIUDAD DE JUCHIT\u00c1N DE ZARAGOZA, Heroica Ciudad de Juchit\u00e1n de Zaragoza, Oaxaca</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row13_col1\" class=\"data row13 col1\" >MX</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row13_col2\" class=\"data row13 col2\" >MX</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row14_col0\" class=\"data row14 col0\" >1169, RUA DOUTOR ALBUQUERQUE LINS, BLOCO B ANDAR 11 APARTAMENTO 112B, S\u00e3o Paulo, SP, 01203-001</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row14_col1\" class=\"data row14 col1\" >BR</td>\n", "                        <td id=\"T_7f28c7ac_33bc_11eb_9747_a4bb6dafa070row14_col2\" class=\"data 
row14 col2\" >BR</td>\n", "            </tr>\n", "    </tbody></table>"], "text/plain": ["<pandas.io.formats.style.Styler at 0x15484f35a08>"]}, "metadata": {}, "output_type": "display_data"}], "source": ["model.show_results(15)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Test the model's prediction on a single input text."]}, {"cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["('1016, 8A, CL RICARDO LEON - SANTA ANA (CARTAGENA), 30319', 'ES', 1.0)\n"]}], "source": ["text = \"\"\"1016, 8A, CL RICARDO LEON - SANTA ANA (CARTAGENA), 30319\"\"\"\n", "print(model.predict(text))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Once you are satisfied with the model, you can save it using the `save()` method. This creates a **Deep Learning Package (DLPK file)** that can be used for inference on unseen data."]}, {"cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Computing model metrics...\n"]}, {"data": {"text/plain": ["WindowsPath('models/country-classifier')"]}, "execution_count": 18, "metadata": {}, "output_type": "execute_result"}], "source": ["model.save(\"country-classifier\")"]}, {"cell_type": "markdown", "metadata": {}, "source": ["# Model inference\n", "\n", "The trained model can be used to classify new text documents using the `predict()` method. This method accepts a single string or a list of strings and returns the predicted label for each input, along with a confidence score."]}, {"cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [{"data": {"text/html": ["\n", "    <div>\n", "        <style>\n", "            /* Turns off some styling */\n", "            progress {\n", "                /* gets rid of default border in Firefox and Opera. */\n", "                border: none;\n", "                /* Needs to be in here for Safari polyfill so background images work as expected. 
*/\n", "                background-size: auto;\n", "            }\n", "            .progress-bar-interrupted, .progress-bar-interrupted::-webkit-progress-bar {\n", "                background: #F44336;\n", "            }\n", "        </style>\n", "      <progress value='15' class='' max='15' style='width:300px; height:20px; vertical-align: middle;'></progress>\n", "      100.00% [15/15 00:00<00:00]\n", "    </div>\n", "    "], "text/plain": ["<IPython.core.display.HTML object>"]}, "metadata": {}, "output_type": "display_data"}, {"data": {"text/html": ["<style  type=\"text/css\" >\n", "    #T_9905dec2_33bd_11eb_956a_a4bb6dafa070 th {\n", "          text-align: left;\n", "    }#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row0_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row0_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row0_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row1_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row1_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row1_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row2_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row2_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row2_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row3_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row3_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row3_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row4_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row4_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row4_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row5_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row5_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row5_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row6_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row6_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row6_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row7_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row7_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row7_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row8_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row8_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row
8_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row9_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row9_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row9_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row10_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row10_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row10_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row11_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row11_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row11_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row12_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row12_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row12_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row13_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row13_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row13_col2,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row14_col0,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row14_col1,#T_9905dec2_33bd_11eb_956a_a4bb6dafa070row14_col2{\n", "            text-align:  left;\n", "        }</style><table id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070\" ><thead>    <tr>        <th class=\"col_heading level0 col0\" >Address</th>        <th class=\"col_heading level0 col1\" >CountryCode</th>        <th class=\"col_heading level0 col2\" >Confidence</th>    </tr></thead><tbody>\n", "                <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row0_col0\" class=\"data row0 col0\" >136, AV MARINA ALTA DE LA, 3740</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row0_col1\" class=\"data row0 col1\" >ES</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row0_col2\" class=\"data row0 col2\" >0.999972</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row1_col0\" class=\"data row1 col0\" >3, CL CLOTS DELS, 43791</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row1_col1\" class=\"data row1 col1\" >ES</td>\n", " 
                       <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row1_col2\" class=\"data row1 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row2_col0\" class=\"data row2 col0\" >FAZENDA  LAJEADO, Mimoso do Sul, ES, 29400-000</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row2_col1\" class=\"data row2 col1\" >BR</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row2_col2\" class=\"data row2 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row3_col0\" class=\"data row3 col0\" >118, CALLE MONTE DE PIEDAD, SAN JUAN DE LOS LAGOS, San Juan de los Lagos, Jalisco</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row3_col1\" class=\"data row3 col1\" >MX</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row3_col2\" class=\"data row3 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row4_col0\" class=\"data row4 col0\" >138A, CALLE EMILIANO ZAPATA, CIUDAD GUZM\u00c1N, Zapotl\u00e1n el Grande, Jalisco</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row4_col1\" class=\"data row4 col1\" >MX</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row4_col2\" class=\"data row4 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row5_col0\" class=\"data row5 col0\" >28, Rue Gustave Eiffel, Brie-Comte-Robert, 77170</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row5_col1\" class=\"data row5 col1\" >FR</td>\n", "                        <td 
id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row5_col2\" class=\"data row5 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row6_col0\" class=\"data row6 col0\" >19235, AVENUE 6, MADERA, 93637</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row6_col1\" class=\"data row6 col1\" >US</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row6_col2\" class=\"data row6 col2\" >0.999995</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row7_col0\" class=\"data row7 col0\" >2734, CALLE G\u00d3MEZ FAR\u00cdAS, GUADALAJARA, Guadalajara, Jalisco</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row7_col1\" class=\"data row7 col1\" >MX</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row7_col2\" class=\"data row7 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row8_col0\" class=\"data row8 col0\" >4237, WHISKEY AVE</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row8_col1\" class=\"data row8 col1\" >CA</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row8_col2\" class=\"data row8 col2\" >0.542203</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row9_col0\" class=\"data row9 col0\" >224, SWANSEA ROAD, MOUNT EVELYN, VIC, 3796</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row9_col1\" class=\"data row9 col1\" >AU</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row9_col2\" class=\"data row9 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "     
                           <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row10_col0\" class=\"data row10 col0\" >920, N  MARTIN L KING BLVD</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row10_col1\" class=\"data row10 col1\" >US</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row10_col2\" class=\"data row10 col2\" >0.998366</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row11_col0\" class=\"data row11 col0\" >13, HOLBERG STREET, MOONEE PONDS, VIC, 3039</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row11_col1\" class=\"data row11 col1\" >AU</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row11_col2\" class=\"data row11 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row12_col0\" class=\"data row12 col0\" >8, AVENIDA RONCESVALLES / ORREAGA ETORBIDEA, Pamplona / Iru\u00f1a, Pamplona / Iru\u00f1a, Navarra, 31002</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row12_col1\" class=\"data row12 col1\" >ES</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row12_col2\" class=\"data row12 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row13_col0\" class=\"data row13 col0\" >36, Rue Alphonse Hottat, 38, 1050</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row13_col1\" class=\"data row13 col1\" >BE</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row13_col2\" class=\"data row13 col2\" >1.000000</td>\n", "            </tr>\n", "            <tr>\n", "                                <td 
id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row14_col0\" class=\"data row14 col0\" >SN, CALLE 55 COBA, JOS\u00c9 MAR\u00cdA MORELOS, Jos\u00e9 Mar\u00eda Morelos, Quintana Roo</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row14_col1\" class=\"data row14 col1\" >MX</td>\n", "                        <td id=\"T_9905dec2_33bd_11eb_956a_a4bb6dafa070row14_col2\" class=\"data row14 col2\" >1.000000</td>\n", "            </tr>\n", "    </tbody></table>"], "text/plain": ["<pandas.io.formats.style.Styler at 0x15485145988>"]}, "execution_count": 19, "metadata": {}, "output_type": "execute_result"}], "source": ["# Here we pick addresses from the validation dataset, but you can supply your own list of addresses\n", "text_list = data._valid_df.sample(15).Address.values\n", "result = model.predict(text_list)\n", "\n", "df = pd.DataFrame(result, columns=[\"Address\", \"CountryCode\", \"Confidence\"])\n", "\n", "df.style.set_table_styles([dict(selector='th', props=[('text-align', 'left')])])\\\n", "        .set_properties(**{'text-align': \"left\"}).hide_index()"]}, {"cell_type": "markdown", "metadata": {}, "source": ["# References"]}, {"cell_type": "markdown", "metadata": {}, "source": ["[1] [Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/)\n", "\n", "[2] [Attention and its Different Forms](https://towardsdatascience.com/attention-and-its-different-forms-7fc3674d14dc)\n", "\n", "[3] [The Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html)\n", "\n", "[4] [Summary of the models](https://huggingface.co/transformers/summary.html)\n", "\n", "[5] [BERT Paper](https://arxiv.org/pdf/1810.04805.pdf)\n", "\n", "[6] [Learning Rate](https://en.wikipedia.org/wiki/Learning_rate)\n", "\n", "[7] [Precision, recall and F1-measures](https://scikit-learn.org/stable/modules/model_evaluation.html#precision-recall-and-f-measures)"]}], "metadata": {"kernelspec": {"display_name": "Python 3", "language": "python", 
"name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.9"}}, "nbformat": 4, "nbformat_minor": 4}